LISTING OF THE CLAIMS
(including amendments, if any) 

1. (currently amended) A method implemented in a computer system, for clustering a string,
the string including a plurality of characters, the method including:

    identifying R unique n-grams T1...R in the string;
    for every unique n-gram TS:

        if a frequency of TS in a set of n-gram statistics is not greater than a first threshold:
            clustering the string with a cluster associated with TS;

        otherwise:

            for every other n-gram TV in the string T1...R, except S:

                concluding that the frequency of n-gram TV is greater than the first threshold,
                and in response:

                    if the frequency of an n-gram pair TS-TV is not greater than a second
                    threshold:

                        clustering the string with a cluster associated with the n-gram pair
                        TS-TV;

                    otherwise:

                        for every other n-gram TX in the string T1...R, except S and V:

                            clustering the string with a cluster associated with an n-gram
                            triple TS-TV-TX;

where T1...R is a set of n-grams, R is the number of elements in T1...R, and TS, TV, and
TX are members of T1...R, and S, V, and X are integer indexes to identify
members of T1...R.

2. (original) The method of claim 1 further including compiling n-gram statistics. 

3. (original) The method of claim 1 further including compiling n-gram pair statistics. 
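
The steps recited in claims 1 to 3 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names (`ngrams`, `compile_stats`, `cluster_keys`), the choice of n = 3, the use of "number of strings containing the n-gram" as the frequency, and the treatment of a pair or triple as unordered are all assumptions made for the example. An n-gram TV whose frequency is not greater than the first threshold is simply skipped in the inner loop, which is one possible reading of the "concluding ... and in response" step.

```python
from collections import Counter
from itertools import combinations


def ngrams(s, n=3):
    """Return the unique n-grams of s in order of first appearance."""
    seen = []
    for i in range(len(s) - n + 1):
        gram = s[i:i + n]
        if gram not in seen:
            seen.append(gram)
    return seen


def compile_stats(strings, n=3):
    """Count how many strings contain each n-gram and each unordered n-gram pair."""
    single, pair = Counter(), Counter()
    for s in strings:
        grams = ngrams(s, n)
        single.update(grams)
        pair.update(frozenset(p) for p in combinations(grams, 2))
    return single, pair


def cluster_keys(s, single, pair, t1, t2, n=3):
    """Return the set of cluster keys (tuples of n-grams) for string s."""
    grams = ngrams(s, n)
    keys = set()
    for ts in grams:
        if single[ts] <= t1:
            # Rare n-gram: cluster on TS alone.
            keys.add((ts,))
            continue
        for tv in grams:
            if tv == ts:
                continue
            if single[tv] <= t1:
                # Assumption: only n-grams above the first threshold are paired.
                continue
            if pair[frozenset((ts, tv))] <= t2:
                # Rare pair: cluster on TS-TV (stored sorted, i.e. unordered).
                keys.add(tuple(sorted((ts, tv))))
            else:
                # Frequent pair: fall back to every triple TS-TV-TX.
                for tx in grams:
                    if tx in (ts, tv):
                        continue
                    keys.add(tuple(sorted((ts, tv, tx))))
    return keys


single, pair = compile_stats(["abcd", "abce", "abcf", "xyz"])
print(cluster_keys("abcd", single, pair, t1=1, t2=1))  # {('bcd',)}
```

In this example "abc" occurs in three strings and exceeds the first threshold, while "bcd" occurs once, so the string "abcd" is clustered on "bcd" alone.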

4-5. (cancelled)

6. (currently amended) A method implemented in a computer system, for clustering a string,
the string including a plurality of characters, the method including:

    identifying R unique n-grams T1...R in the string;
    for every unique n-gram TS:

        if a frequency of TS in a set of n-gram statistics is not greater than a first threshold:

            clustering the string with a cluster associated with TS;

        otherwise:

            for i = 1 to Y:

                for every unique set of i n-grams TU in the string T1...R, except S:

                    if the frequency of the n-gram set TS-TU is not greater than a second
                    threshold:

                        clustering the string with a cluster associated with the n-gram set
                        TS-TU;

            if the string has not been associated with a cluster with this value of TS:

                for every unique set of Y+1 n-grams TUY in the string T1...R, except S:

                    clustering the string with a cluster associated with the Y+2 n-gram
                    group TS-TUY,

where T1...R is a set of n-grams, R is the number of elements in T1...R, TS and TU are
members of T1...R, TUY is a subset of T1...R, S, V, and X are integer indexes to
identify members of T1...R, and i and Y are integers.

7. (original) The method of claim 6 where Y = 1. 

8. (original) The method of claim 6 further including compiling n-gram statistics. 

9. (original) The method of claim 6 further including compiling n-gram group statistics. 
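
The generalized loop of claims 6 to 9 can be sketched in the same way. Again this is only an illustration under assumptions: the name `cluster_keys_general`, the representation of n-gram groups as `frozenset` keys in a single `Counter` of group statistics, and the decision to keep every qualifying group rather than stop at the first one are choices made for the example and are not stated in the claims.

```python
from collections import Counter
from itertools import combinations


def cluster_keys_general(grams, group_freq, t1, t2, y):
    """Return cluster keys for a string whose unique n-grams are `grams`.

    group_freq maps a frozenset of n-grams to its frequency; single
    n-grams are stored as one-element frozensets (an assumption).
    """
    keys = set()
    for ts in grams:
        if group_freq[frozenset((ts,))] <= t1:
            # Rare n-gram: cluster on TS alone.
            keys.add(frozenset((ts,)))
            continue
        others = [g for g in grams if g != ts]
        clustered = False
        for i in range(1, y + 1):
            # Every unique set TU of i other n-grams.
            for tu in combinations(others, i):
                group = frozenset((ts, *tu))
                if group_freq[group] <= t2:
                    keys.add(group)
                    clustered = True
        if not clustered:
            # No rare group found for this TS: use every Y+2 n-gram group.
            for tuy in combinations(others, y + 1):
                keys.add(frozenset((ts, *tuy)))
    return keys


# Hypothetical statistics for a string with n-grams abc, bcd, cde.
stats = Counter({
    frozenset(("abc",)): 5,
    frozenset(("bcd",)): 5,
    frozenset(("cde",)): 1,
    frozenset(("abc", "bcd")): 4,
    frozenset(("abc", "cde")): 1,
    frozenset(("bcd", "cde")): 1,
})
print(cluster_keys_general(["abc", "bcd", "cde"], stats, t1=2, t2=2, y=1))
```

With Y = 1, as in claim 7, the loop checks pairs first and falls back to three-n-gram groups, which mirrors the pair and triple steps of claim 1.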

10. (currently amended) A computer program, stored on a tangible storage medium, for use in
clustering a string, the program including executable instructions that cause a computer to:

    identify R unique n-grams T1...R in the string;
    for every unique n-gram TS:

        if a frequency of TS in a set of n-gram statistics is not greater than a first threshold:
            cluster the string with a cluster associated with TS;

        otherwise:

            for every other n-gram TV in the string T1...R, except S:

                concluding that the frequency of n-gram TV is greater than the first threshold,
                and in response:

                    if the frequency of an n-gram pair TS-TV is not greater than a second
                    threshold:

                        cluster the string with a cluster associated with the n-gram pair
                        TS-TV;

                    otherwise:

                        for every other n-gram TX in the string T1...R, except S and V:

                            cluster the string with a cluster associated with an n-gram
                            triple TS-TV-TX;

where T1...R is a set of n-grams, R is the number of elements in T1...R, and TS, TV,
and TX are members of T1...R, and S, V, and X are integer indexes to
identify members of T1...R.

11. (original) The computer program of claim 10 further including executable instructions that 
cause a computer to compile n-gram statistics. 

12. (original) The computer program of claim 10 further including executable instructions that 
cause a computer to compile n-gram pair statistics. 